The optimal capacity of a diluted Blume-Emery-Griffiths neural network is studied as a function
of the pattern activity and the embedding stability using the Gardner entropy approach. Annealed
dilution is considered, cutting some of the couplings referring to the ternary patterns themselves and
some of the couplings related to the active patterns, either simultaneously (synchronous dilution) or
independently (asynchronous dilution). Through the de Almeida-Thouless criterion it is found that
the replica-symmetric solution is locally unstable as soon as there is dilution. The distribution of
the couplings shows the typical gap with a width depending on the amount of dilution, but this gap
persists even in cases where a particular type of coupling plays no role in the learning process.
I. INTRODUCTION

During recent years the Blume-Emery-Griffiths (BEG) model [1] has been studied quite
intensively in the context of neural networks, one of the reasons being the argument that this
model maximizes the mutual information content of three-state networks with Hebbian-type
learning rules. To know in more detail how the retrieval quality of the BEG network compares
with other three-state neuron models, the thermodynamics of this model was studied and
temperature-capacity phase diagrams were obtained. It was shown that the retrieval phase is
systematically larger than that of other three-state models and that the critical capacity is
about twice as large as that of the three-state Ising neuron model. Also the region of
thermodynamic stability is much larger and, furthermore, the phase diagram itself is much
richer with the presence of a stable quadrupolar state, carrying also retrieval information, at
high temperatures.

It was also shown that this enhancement of the retrieval properties is not restricted to the
use of the Hebbian learning rule but that it is inherent to the model. Indeed, by studying the
Gardner optimal capacity in replica-symmetric (RS) mean-field theory it was found recently
that for the corresponding BEG perceptron with, e.g., zero embedding stability parameter and
uniform patterns this capacity is 2.24. Comparing with other three-state neuron perceptron
models, we recall that for the $Q = 3$ Ising perceptron the Gardner optimal capacity can
maximally reach 1.5, whereas the $Q = 3$ clock and Potts models both reach an optimal
capacity of 2.40 [9, 10]. At this point we have to remark that the $Q = 3$ Ising perceptron and
the BEG perceptron have the same topology structure in the neurons, whereas the $Q = 3$ clock
and Potts models have different topologies. For the Ising topology structure the BEG
perceptron has the best performance.
The interesting question remains whether and in how far these enhanced retrieval properties
are robust against dilution. Studying this question is the aim of the present work. Besides the
fact that the connectivity of biological networks is far from complete, diluted networks offer
the possibility to study the robustness against malfunctioning of some of the connections.
Furthermore, in asymmetric architectures they reduce the internal feedback correlations of
fully connected networks, making a complete analytic description of the dynamics much easier.
Finally, in the BEG perceptron there are two sets of couplings, those referring to the
three-state patterns themselves and those related to the active, i.e., the non-zero patterns. By
diluting both types of couplings simultaneously or diluting these couplings independently, we
can study, in particular, the influence of the active patterns on the Gardner optimal capacity
of the BEG perceptron. These results can be obtained in closed analytic form.

We remark that the type of dilution we study in this paper is such that the number of
connections to a given site still increases with the size of the system. In the replica approach
to capacity problems for these systems, only order parameters with two replica indices appear.
Recently, the study of neural networks with finite connectivity, i.e., where the number of
connections to a given site remains finite in the thermodynamic limit, has been started. There,
functional order parameters have to be introduced.



The paper is organized as follows. In Sect. II we recall the BEG model and briefly discuss
some of its properties. In Sect. III we introduce the different kinds of dilution that one may
study and report on the application of the Gardner approach to these cases. We present the
results for the optimal capacity in the RS approximation as a function of the pattern activity,
the stability parameter and the degree of dilution. In Sect. IV we discuss the results for the
distribution of the couplings and in Sect. V we study the validity of the local stability
criterion for the RS solution. The last section contains the conclusions.
II. THE BEG NEURAL NETWORK

Let us consider a neural network consisting of $N$ neurons which can take values $\sigma_i$,
$i = 1, \ldots, N$ from the discrete set $S = \{-1, 0, +1\}$. The $p$ patterns to be stored in
this network are supposed to be a collection of independent and identically distributed random
variables (i.i.d.r.v.), $\xi_i^\mu$, $\mu = 1, \ldots, p$, taken from the set $S$ with a
probability distribution

$$ p(\xi_i^\mu) = \frac{a}{2}\,\delta(\xi_i^\mu - 1) + \frac{a}{2}\,\delta(\xi_i^\mu + 1) + (1 - a)\,\delta(\xi_i^\mu) \tag{1} $$

with $a$ the activity of the patterns so that

$$ \lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} (\xi_i^\mu)^2 = a . \tag{2} $$
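As an illustration of the pattern statistics in (1)-(2), the following minimal sketch draws ternary patterns with a given activity and checks that the mean of $(\xi_i^\mu)^2$ approaches $a$. The function names `sample_pattern` and `empirical_activity` are hypothetical and are not taken from the paper.

```python
import random

def sample_pattern(n, activity, rng):
    """Draw n i.i.d. entries: +1 or -1 with probability activity/2 each, 0 otherwise."""
    pattern = []
    for _ in range(n):
        r = rng.random()
        if r < activity / 2:
            pattern.append(1)
        elif r < activity:
            pattern.append(-1)
        else:
            pattern.append(0)
    return pattern

def empirical_activity(pattern):
    """Fraction of non-zero entries, i.e. the sample mean of xi**2."""
    return sum(x * x for x in pattern) / len(pattern)

rng = random.Random(0)  # fixed seed so the run is reproducible
xi = sample_pattern(100_000, 0.3, rng)
print(empirical_activity(xi))  # close to the requested activity 0.3
```

For large $N$ the printed value fluctuates around $a$ with a spread of order $\sqrt{a(1-a)/N}$.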



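Given the network configuration at time $t$, $\sigma_N(t) \equiv \{\sigma_j(t)\}$,
$j = 1, \ldots, N$, the following dynamics is considered. The configuration $\sigma_N(0)$ is
chosen as input. The neurons are updated according to the stochastic parallel spin-flip
dynamics defined by the transition probabilities

$$ \Pr\{\sigma_i(t+1) = s' \in S \mid \sigma_N(t)\} = \frac{\exp[-\beta\,\epsilon_i(s'|\sigma_N(t))]}{\sum_{s \in S} \exp[-\beta\,\epsilon_i(s|\sigma_N(t))]} . \tag{3} $$

Here the energy potential $\epsilon_i[s|\sigma_N(t)]$ is defined by

$$ \epsilon_i[s|\sigma_N(t)] = -s\,h_i(\sigma_N(t)) - s^2\,\theta_i(\sigma_N(t)) , \tag{4} $$

where the local fields in neuron $i$, $h_{N,i}(t) = h_i(\sigma_N(t))$ and
$\theta_{N,i}(t) = \theta_i(\sigma_N(t))$, carry all the information

$$ h_{N,i}(t) = \sum_{j \neq i} J_{ij}\,\sigma_j(t), \qquad \theta_{N,i}(t) = \sum_{j \neq i} K_{ij}\,\sigma_j^2(t) . \tag{5} $$

At zero temperature the updating rule of this dynamics is equivalent to the gain function
formulation

$$ \sigma_i(t+1) = \mathrm{sign}(h_{N,i}(t))\,\Theta(|h_{N,i}(t)| + \theta_{N,i}(t)) \equiv g(h_{N,i}(t), \theta_{N,i}(t)) \tag{6} $$

with $\Theta(x)$ and $\mathrm{sign}(x)$ the Heaviside and the sign function, respectively.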
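The relation between the stochastic rule (3)-(4) and the zero-temperature gain function (6) can be checked numerically. The sketch below is an illustration only: the names `energy`, `transition_probabilities` and `gain` are hypothetical, and the convention $\Theta(0) = 0$ is an assumption made here, since the text does not fix the value of the step function at the origin.

```python
import math

STATES = (-1, 0, 1)

def energy(s, h, theta):
    """Single-site energy potential of Eq. (4): -s*h - s**2*theta."""
    return -s * h - s * s * theta

def transition_probabilities(h, theta, beta):
    """Probabilities of Eq. (3) for the next state s' in {-1, 0, +1}."""
    energies = [energy(s, h, theta) for s in STATES]
    e_min = min(energies)  # shift by the minimum to avoid overflow at large beta
    weights = [math.exp(-beta * (e - e_min)) for e in energies]
    z = sum(weights)
    return {s: w / z for s, w in zip(STATES, weights)}

def gain(h, theta):
    """Zero-temperature rule of Eq. (6): sign(h) * Theta(|h| + theta)."""
    if abs(h) + theta <= 0:  # assumed convention: Theta(0) = 0
        return 0
    return (h > 0) - (h < 0)  # sign(h), with sign(0) = 0

probs = transition_probabilities(h=0.4, theta=-0.1, beta=50.0)
print(max(probs, key=probs.get), gain(0.4, -0.1))
```

At large $\beta$ the most probable next state is the one minimizing the energy (4), which is what the gain function selects: a non-zero state $\mathrm{sign}(h)$ has energy $-|h| - \theta$, and it beats the zero state (energy 0) exactly when $|h| + \theta > 0$.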

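Concerning the loading capacity of this model, the following results have appeared in the
literature. For Hebbian-type synaptic couplings $J_{ij}$ and $K_{ij}$

$$ J_{ij} = \frac{1}{a^2 N} \sum_{\mu=1}^{p} \xi_i^\mu \xi_j^\mu, \qquad K_{ij} = \frac{1}{N} \sum_{\mu=1}^{p} \eta_i^\mu \eta_j^\mu \tag{7} $$

$$ \eta_i^\mu = \frac{1}{a(1-a)} \left[ (\xi_i^\mu)^2 - a \right] \tag{8} $$

the long-time behavior is governed by the Hamiltonian

$$ H = -\frac{1}{2} \sum_{i \neq j} J_{ij}\,\sigma_i \sigma_j - \frac{1}{2} \sum_{i \neq j} K_{ij}\,\sigma_i^2 \sigma_j^2 \tag{9} $$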



and the retrieval properties are enhanced in comparison to other three-state neuron models. In
particular, the retrieval phase is systematically larger than that of other three-state models
and the critical capacity is about twice as large as that of the three-state Ising neuron model.
Moreover, depending on the value of the pattern activity, a stable quadrupolar state carrying
also non-zero retrieval information arises at high temperatures. However, an underlying reason
why there is such an enlargement of the basin of attraction and hence of the retrieval
properties of the network seems still to be absent.

This enhancement of retrieval has also been found for the BEG perceptron

$$ \xi_0^\mu = \mathrm{sgn}(h^\mu)\,\Theta(|h^\mu| + \theta^\mu), \qquad \forall \mu = 1, \ldots, p \tag{10} $$

with $\xi_0^\mu$ denoting the output, and where $h^\mu$ and $\theta^\mu$ are the local fields
at the output created by the pattern $\xi^\mu$

$$ h^\mu = \frac{1}{\sqrt{N}} \sum_{i=1}^{N} J_i\,\xi_i^\mu, \qquad \theta^\mu = \frac{1}{\sqrt{N}} \sum_{i=1}^{N} K_i\,(\xi_i^\mu)^2 \tag{11} $$

with $J_i$, $K_i$ a set of couplings connecting the input with the output. In a RS analysis the
Gardner optimal capacity for this perceptron is calculated analytically and seen to be bigger
than that of the $Q = 3$ Ising perceptron.



III. THE DILUTED BEG PERCEPTRON 

We want to find out in how far these enhanced retrieval properties are robust against
dilution. One of the questions we want to answer then is the following. Let $\xi_i^\mu$,
$\mu = 1, \ldots, p$, $i = 0, \ldots, N$ be an extensive set of $p = \alpha N$ patterns supposed
to be fixed points of the dynamical rule (10), where the local fields $h^\mu$ and $\theta^\mu$
are now given by

$$ h^\mu = \frac{1}{\sqrt{N}} \sum_{i=1}^{N} c_i^J J_i\,\xi_i^\mu, \qquad \theta^\mu = \frac{1}{\sqrt{N}} \sum_{i=1}^{N} c_i^K K_i\,(\xi_i^\mu)^2 . \tag{12} $$

The parameters $c_i^J \in \{0, 1\}$ and $c_i^K \in \{0, 1\}$ control the presence of the
connections $J_i$ and $K_i$. We want to find a set of couplings, $\{J_i, K_i\}$, or
equivalently, a BEG perceptron with an average dilution $c_J$ and $c_K$

$$ c_J = \frac{1}{N} \sum_{i=1}^{N} c_i^J, \qquad c_K = \frac{1}{N} \sum_{i=1}^{N} c_i^K \tag{13} $$

that still fulfil the conditions (10). It is clear that for small values of the capacity
$\alpha$ more than one BEG perceptron storing these patterns can be found. The bigger the value
of $\alpha$ the more difficult this task becomes and a saturation limit, called Gardner optimal
capacity, is reached.
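The storage condition described above can be made concrete with a small sketch that evaluates the diluted local fields, applies the output rule (10), and counts how many patterns a given set of couplings reproduces. Everything here is illustrative: the function names are hypothetical, the $1/\sqrt{N}$ normalization of the fields is an assumption of this sketch, and the random couplings merely stand in for a learned solution.

```python
import math
import random

def local_fields(J, K, cJ, cK, xi):
    """Diluted fields h and theta for one input pattern (1/sqrt(N) scaling assumed)."""
    n = len(xi)
    h = sum(c * j * x for c, j, x in zip(cJ, J, xi)) / math.sqrt(n)
    theta = sum(c * k * x * x for c, k, x in zip(cK, K, xi)) / math.sqrt(n)
    return h, theta

def output(h, theta):
    """Rule (10): sgn(h) * Theta(|h| + theta), with Theta(0) = 0 assumed."""
    if abs(h) + theta <= 0:
        return 0
    return (h > 0) - (h < 0)

def stored_fraction(J, K, cJ, cK, inputs, targets):
    """Fraction of patterns whose target output is reproduced."""
    hits = sum(output(*local_fields(J, K, cJ, cK, xi)) == t
               for xi, t in zip(inputs, targets))
    return hits / len(inputs)

def average_dilution(mask):
    """Fraction of couplings that are kept, as in (13)."""
    return sum(mask) / len(mask)

rng = random.Random(1)
N, P = 200, 20
J = [rng.gauss(0, 1) for _ in range(N)]
K = [rng.gauss(0, 1) for _ in range(N)]
cJ = [int(rng.random() < 0.5) for _ in range(N)]  # keep about half of the J-couplings
cK = [int(rng.random() < 0.5) for _ in range(N)]  # independent mask: asynchronous dilution
inputs = [[rng.choice((-1, 0, 1)) for _ in range(N)] for _ in range(P)]
# Targets are generated by the same perceptron, so they are stored by construction.
targets = [output(*local_fields(J, K, cJ, cK, xi)) for xi in inputs]
print(average_dilution(cJ), average_dilution(cK))
print(stored_fraction(J, K, cJ, cK, inputs, targets))
```

Setting `cK = cJ` in this sketch corresponds to synchronous dilution. The Gardner calculation asks the reverse question: for prescribed random targets, how large can $p/N$ be before no couplings with the required dilution exist.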

In the following we study dilution during learning, i.e., annealed dilution, which can be
realized in two different ways. The first one, called synchronous dilution, assumes that
$c_i^J = c_i^K = c_i$ and, hence, $c_J = c_K$; the second one, named asynchronous dilution,
allows the $c_i$'s to be different. Looking back at (12) we see that the $c_i^J$ control the
connections of the three-state patterns, while the $c_i^K$ control the connections related to
the active, i.e., the non-zero patterns. In fact, for Hebbian learning in (7)-(8) the
$K$-couplings control the fluctuations around these active patterns. Therefore, by allowing
synchronous or asynchronous dilution we can study the influence of the active patterns on the
optimal capacity of the BEG perceptron.



A. Synchronous dilution 

To study the optimal capacity, we follow the entropy approach introduced by Gardner. Since
the dynamical variables are continuous, entropy has meaning only in a relative sense, and we
write the volume $V$ of all possible BEG perceptrons satisfying (10), without normalizing, as



N p 



■ Ci.Ji,Ki 
i—1 /X— 1 



(14) 



with x^j^{h'^, 1^) the characteristic function given by 

+[i-{mQHh^\-o^-^) (15) 



where $\kappa$ is the embedding stability parameter. Since we consider continuous couplings
we need to introduce a modified spherical constraint



N 



1=1 



N 
i=l 



cN . 



(16) 



From this spherical constraint we see that the couplings are not well normalized at those
sites where $c_i$ is zero. One can solve this difficulty either by introducing an extra
spherical constraint for the remaining couplings or by restricting the trace over the
couplings. We take the second solution and define the restricted trace as

$$ \mathrm{Tr}_{c_i, J_i, K_i}(\cdots) = \sum_{c_i = 0,1} \delta_{c_i,0}\,(\cdots) + \sum_{c_i = 0,1} \delta_{c_i,1} \int dJ_i\,dK_i\,(\cdots) . \tag{17} $$

Since we want to study typical features of the system the important quantity to average over
is the entropy. Employing replica techniques we express the entropy per neuron as

$$ v = \lim_{N \to \infty} \lim_{n \to 0} \frac{1}{Nn} \ln \langle\langle V^n \rangle\rangle \tag{18} $$

where $\langle\langle \cdot \rangle\rangle$ denotes the average over the pattern distribution (1)
and where $V^n$ is the $n$-times replicated volume of solutions



N n n N N N p n 



=1 



The further analysis then proceeds in a standard way 
although the technical details are much more involved. 
A short account is given in Appendix A. 

The results are described essentially in terms of three order parameters: the first one,
$q_{\alpha\beta}$, defined as the overlap between two distinct replicas for the couplings
$J_i$; the second one, $r_{\alpha\beta}$, a similar quantity for the couplings $K_i$; and the
third one, $L^\alpha$, arising from the fact that the dynamics and, hence, also the
characteristic function contain a second field $\theta$, quadratic in the patterns (see (48)).
In the RS approximation we are discussing here they are given by $q_{\alpha\beta} = q$,
$r_{\alpha\beta} = r$, $L^\alpha = L$.

The RS Gardner optimal capacity is obtained when the overlap order parameters $q$ and $r$ go
to 1. It is clear that these limits have to be taken simultaneously but, in general, their
rates of convergence could be different. Therefore, we introduce $(1 - r) = \gamma (1 - q)$
where $\gamma$ is a new parameter with respect to which one also needs to extremize. We expect
this parameter $\gamma$ to depend on the pattern distribution through the activity $a$. The
result for the replica-symmetric Gardner optimal capacity $\alpha^{RS}$ then reads

$$ \alpha^{RS}(a, \kappa, c) = \underset{u, L, \gamma}{\mathrm{extr}}\; \frac{A_{\rm syn}(u, \gamma; c)}{g(\gamma, L; a, \kappa)} \tag{20} $$

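where the function $A_{\rm syn}(u, \gamma; c)$ is defined by

$$ A_{\rm syn}(u, \gamma; c) = u^2 c + A^{(2)}_{\rm syn}(u, \gamma) . \tag{21} $$

Stationarity with respect to $u$ then leads to $c = A^{(1)}_{\rm syn}(u, \gamma)$. Here the
functions $A^{(m)}_{\rm syn}$, $m = 1, 2$ are given by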

A^Kun) - ^ |^'^,^.exp[-^(cos^^ + 7sinV)] 



2tt 



cos^ <P + 7 sin^ (p) 



(22) 

The function $g(\gamma, L; a, \kappa)$ in (20) can be expressed as

3 f 

+{l-a)J2 / V{hoM^eo-Qd^„,iho,9o) 



(23) 



with $U = (aL + \kappa)/\sqrt{a(1 - a)}$ and $\mathcal{D}(ax + b) = (2\pi)^{-1/2}\, a \exp[-\tfrac{1}{2}(ax + b)^2]\, dx$.
The integration regions are the following ones



7^l = 



ho<0 
Oo>0 



-Oo/y <ho< Ooli 



hn>0 



-hah' <0o< I'ha ^ ' 
and the corresponding integrands are given by 



mm 



1 + (Y) 



_ u2 ,n2 



^0 <0 
ho<0 



2 d™i 



1 + (7')^ 



(24) 
(25) 

(26) 
(27) 



r 



with 7' = ^ 7(1 — a). 

After inserting (23)-(27) in (20) and extremizing numerically we find the results presented in
Fig. 1. We plot the optimal capacity $\alpha(a, \kappa, c)$ itself (insets) and its values
normalized by the optimal capacity for no dilution, $\alpha(a, \kappa, c)/\alpha(a, \kappa, 1)$,
as a function of the dilution $c$ for $\kappa = 0$ and several values of the activity $a$.

We see that different regions of activities lead to different results. For small activities
$a < 0.5$ and, hence, many inactive neurons, the optimal capacity strongly increases for
decreasing dilution. This seems to be in agreement with what is known in the literature for
very small activities or so-called sparse coding (see, e.g., [20]). When normalizing these
results by $\alpha(a, \kappa, 1)$ we find that all the lines collapse into the full line. For
large activities $a > 0.6$ and, hence, many active states $\pm 1$, the results for
$\alpha(a, \kappa, c)$ are only weakly dependent on the activity (see inset) but the results
for the normalized optimal capacity do not collapse. Furthermore, we see that the network is
more robust against synchronous dilution for non-sparse coding, i.e., for activities ranging in
the interval [0.2, 1.0]: the dilution $c$ can decrease from 1 to about 0.4 before one sees a
substantial decrease in the optimal capacity. Comparing to the $Q = 3$ Ising perceptron, the
effect of dilution, especially for larger activities, is about the same.

When $c = 1$ ($u = 0$), the functions $A^{(m)}_{\rm syn}(u, \gamma)$ can be explicitly
integrated leading to



4i)„(0,7) = l, 45^0,7) = 1 + 



(28) 



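and one recovers the optimal capacity found in the fully connected case. When the pattern
activity $a$ goes to 1 the system is forced into two possible states, as in the Gardner model
with dilution. Since the overlap parameter $r$ becomes irrelevant in such a limit, $\gamma$
must go to infinity. The numerical solution does confirm this. Furthermore, in this limit the
functions $A^{(m)}_{\rm syn}(u, \gamma)$ become

$$ A^{(1)}_{\rm syn}(u, \infty) = \mathrm{erfc}\!\left(\frac{u}{\sqrt{2}}\right), \qquad A^{(2)}_{\rm syn}(u, \infty) = \frac{2u}{\sqrt{2\pi}} \exp\!\left(-\frac{u^2}{2}\right) + (1 - u^2)\,\mathrm{erfc}\!\left(\frac{u}{\sqrt{2}}\right) $$

and hence

$$ A_{\rm syn}(u, \infty; c) = c + \frac{2u}{\sqrt{2\pi}} \exp\!\left(-\frac{u^2}{2}\right) . \tag{29} $$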



These are precisely the Gardner results with dilution after rescaling $u/\sqrt{2} \to u$. We
remark that in this case, and also for the $Q$-Ising type models, it is possible to rescale the
optimal capacity as follows

$$ \frac{\alpha^{RS}(\kappa, c)}{\alpha^{RS}(\kappa, c = 1)} = c + \frac{2u}{\sqrt{2\pi}} \exp\!\left(-\frac{u^2}{2}\right) \tag{30} $$

with $c = \mathrm{erfc}(u/\sqrt{2})$. For the general BEG perceptron treated here such a
scaling is not possible because the factor $\gamma$ appears both in the numerator and
denominator of Eq. (20). It is possible, however, to derive the bound



c log c < 



aRs{n,a,c^ 1) 



2u . , . , 
<c+-=exp(--) (31) 
v Zn ^ 
FIG. 1: Optimal capacity for synchronous dilution in the BEG perceptron as a function of $c$
for $\kappa = 0$ and several values of $a$. Top figure: the normalized optimal capacity (solid
line), its upper bound (dashed-dotted line) and its lower bound (broken line); the inset
displays $\alpha(a, \kappa, c)$ for $a = 0.1, 0.2, 0.3, 0.4, 0.5$ from top to bottom. Bottom
figure: similar to the top figure for $a = 0.6, 0.7, 0.8, 0.9, 0.95, 0.99$.

FIG. 2: Optimal capacity for asynchronous dilution in the BEG perceptron. Top figure: the
normalized optimal capacity as a function of $c_J$ for $c_K = 1$, $\kappa = 0.0$ and activity
$a = 0.1, 0.3, 0.5, 0.7, 0.9$ from top to bottom. The dashed-dotted line is for $a = 1$. Bottom
figure: the normalized capacity as a function of $c_K$ for $c_J = 1$, $\kappa = 0.0$ and
pattern activity $a = 0.1, 0.3, 0.5, 0.7, 0.9$ from bottom to top.



for $0 < a < 1$ with $c$ and $u$ related through $c = \mathrm{erfc}(u/\sqrt{2})$. These bounds
are shown in Fig. 1 as the broken line (lower bound) and the dashed-dotted line (upper bound).
Although the dependence on the dilution and other parameters is not that simple, we do find
that the dependence on the embedding stability parameter $\kappa$ is rather weak.



DC, , 

extr — 1 (32) 

ujr,UK,L,7 g[j, L;a, K) 



with 



B. Asynchronous dilution 

In this case $c_J$ is different from $c_K$, allowing us to study the relative influence of the
two sets of couplings. A calculation analogous to the one in subsection A leads to the
following result for the optimal capacity



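$$ A_{\rm asyn}(u; c) = c\,u^2 + \sqrt{\frac{2}{\pi}}\; u \exp\!\left(-\frac{u^2}{2}\right) + 2 (1 - u^2)\, H(u) \tag{33} $$

$$ H(u) = \int_u^\infty \frac{dx}{\sqrt{2\pi}} \exp\!\left(-\frac{x^2}{2}\right) . \tag{34} $$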

To simplify notation we denote both $u_J$ and $u_K$ as $u$ in the sequel since there should be
no confusion possible. Stationarity with respect to $u$ leads to $c = \mathrm{erfc}(u/\sqrt{2})$.
We remark that when we take the dilution averages to be equal, i.e., $c_J = c_K = c$, the
dependence on the dilution in (32) factorizes and we simply get

$$ \frac{\alpha^{RS}(a, \kappa, c)}{\alpha^{RS}(a, \kappa, c = 1)} = A_{\rm asyn}(u; c) \tag{35} $$

for any value of the pattern activity $a$ and stability constant $\kappa$.
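The factorized dilution dependence can be evaluated directly. The sketch below assumes the forms of (33) and (34) as read here, solves the stationarity condition $c = \mathrm{erfc}(u/\sqrt{2})$ by bisection, and returns $A_{\rm asyn}$ at the stationary point. The function names are hypothetical, and the bisection is a choice made for this illustration rather than a method taken from the paper.

```python
import math

def H(u):
    """Gaussian tail integral of Eq. (34); equals erfc(u / sqrt(2)) / 2."""
    return 0.5 * math.erfc(u / math.sqrt(2))

def A_asyn(u, c):
    """Function of Eq. (33), as reconstructed in this sketch."""
    return (c * u * u
            + math.sqrt(2 / math.pi) * u * math.exp(-u * u / 2)
            + 2 * (1 - u * u) * H(u))

def solve_u(c, tol=1e-12):
    """Solve c = erfc(u / sqrt(2)) for u >= 0 by bisection, for 0 < c <= 1."""
    lo, hi = 0.0, 40.0  # erfc is decreasing; at u = 40 it is far below any c of interest
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if math.erfc(mid / math.sqrt(2)) > c:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def normalized_capacity(c):
    """A_asyn at the stationary u, i.e. the capacity ratio for c_J = c_K = c."""
    return A_asyn(solve_u(c), c)

for c in (1.0, 0.8, 0.4, 0.1):
    print(c, normalized_capacity(c))
```

Since $2H(u) = \mathrm{erfc}(u/\sqrt{2})$, inserting the stationarity condition into (33) gives $c + \sqrt{2/\pi}\, u \exp(-u^2/2)$, so the ratio equals 1 at $c = 1$ and stays above $c$ for $0 < c < 1$.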

In order to understand the role of the different couplings in the learning process, we cut
them independently and study the influence with varying activity. The results are presented in
Fig. 2. We plot the optimal capacity normalized by its value for no dilution,
$\alpha(a, \kappa, c)/\alpha(a, \kappa, 1)$, as a function of the dilution of the $J$-couplings,
$c_J$, for $c_K = 1$, $\kappa = 0$ and several values of the activity $a$ (top) and,
analogously (bottom), as a function of the dilution of the $K$-couplings. We find that when
diluting the $J$-couplings, referring to the ternary patterns, and keeping all the
$K$-couplings, related to the active patterns, the normalized capacity decreases as a function
of the activity, reaching the Gardner result for $a = 1$. When doing the reverse, the
normalized capacity increases as a function of the activity. Moreover, the network is more
robust against $K$-dilution, especially for large activities. This seems to be quite natural
since large activities mean many active states $\pm 1$ such that cutting active patterns
becomes relatively less important.



IV. DISTRIBUTION OF COUPLINGS 

We study the distribution of couplings $\rho(J, K)$ inside $V$. This probability distribution
can be split into two parts, the first one involving the $(1 - c)N$ inactive couplings and the
second one, $\rho_r(J, K)$, representing the remaining $cN$ active couplings. Obviously, the
first set of couplings is delta distributed so that we can write

$$ \rho(J, K) = (1 - c)\,\delta(J)\,\delta(K) + \rho_r(J, K) \tag{36} $$

where the second set of couplings satisfies



N 



N p 



i=l n=l I 
I 



(37) 



In order to compute $\rho_r(J, K)$ we proceed by introducing replicas, allowing us to lift the
volume $V$ to the numerator. The calculations are standard but tedious. Evaluating the
expression within the RS approximation we get for synchronous dilution



J 



Pr,syn[J,K)^ 27rc(l + 7) 



7^y„(w,7;c) . 2 , r^2\ 



2c(H-7) 



r 



e 



7 ^syn(M,7; c) / K'^ ' 

c(l+7) 



(38) 



This distribution is a two-dimensional Gaussian from which the middle section has been cut
out, as represented by the Heaviside function. This gap has an ellipsoidal shape because of the
scaling factor $1/\gamma$ accompanying the $K^2$ in the argument. It increases with increasing
dilution to reach its maximum when $c$ tends to zero. In the limit $\gamma \to \infty$ this
distribution reduces to



lim Pr,syn{J,K) 



7 — >-oo 



Asyn(M,0O;c) 

27rc 



exp 



^syn('"; 00; c) ^ ^2~J 



r 



(39) 



We remark that this distribution is different from the one obtained in the Gardner case (i.e.,
the $a \to 1$ limit) because, although the $K$-couplings do not play any role for
$\gamma \to \infty$, the spherical constraint is still present, no matter what the value of $a$
is.

It is interesting to determine how this probability distribution behaves in the case of no
dilution. Then, the distribution (38) for the couplings becomes Gaussian without a gap, viz.




J 



This result is intuitively meaningful since the couplings are forced to obey only the
spherical constraint without any restriction coming from the dilution variable. Therefore, we
find back the probability distribution for the couplings of the fully connected BEG perceptron.

For asynchronous dilution a similar treatment can be pursued and we find that the probability
distribution for the couplings factorizes



^ -J [ ' ) ® r'-"V^a..n(c.,..)j ^''^ 

V27r V Cif \ 2CK J \ \l Ai,syn{CK,UK) J 



These distributions are of a similar nature to the one for the standard diluted perceptron.



V. DE ALMEIDA-THOULESS STABILITY 

Finally, we are interested in studying the local stability of the obtained solutions against
replica-symmetry-breaking fluctuations. From the work on the non-diluted BEG perceptron we
recall that in that case the solutions are unstable only for small activities and very small
embedding constants $\kappa$. Furthermore, we know that, in general, there are four transverse
eigenvalues. In the case of asynchronous dilution these eigenvalues are given by the roots of
the fourth-degree characteristic polynomial



P(A) 



A 



Cj 







CK 



cj 






CK 



Af - A 



(43) 



where A^^ and Ap read 



Ay 



D(x) i^log[l]t(x,i?,g,Vj)]' (44) 

^(^) f-^^^'&iM^^P^^^M]^ (45) 



with $E$, $F$, $\psi_J$ and $\psi_K$ the conjugate variables appearing in the integral
representations of the constraints, $\hat{q}$ and $\hat{r}$ the conjugate variables of the
order parameters $q$ and $r$, and with the short-hand notation



[l]^{x,a,b,d) = 1 + 



2tt 



exp 



2{a-b) 



Similar expressions can be written down for the remaining quantities entering (43) but they
are not needed for the argumentation. Indeed, it is straightforward to check that as soon as
dilution is allowed the solution becomes unstable in the saturation limit $q \to 1$. The first
derivative of $[1](x, a, b, d)$ has a jump at $x = u$ proportional to $u$, leading to a Dirac
delta contribution in the second derivative. The square in (44) and (45) forces the replicon
eigenvalue to go to $+\infty$, similarly to what happens for the standard perceptron model.
When $u = 0$, i.e., in the absence of dilution, there is no such delta contribution and we find
back the results of the non-diluted case. The same reasoning holds for synchronous dilution.



VI. CONCLUSIONS 

In this work we have studied annealed dilution in the BEG perceptron model. Two types of
dilution have been discussed, the first one being synchronous dilution, i.e., simultaneous
dilution of some of the couplings referring to the ternary patterns themselves and some of the
couplings related to the active patterns, the second one being dilution of both these types of
couplings independently, so-called asynchronous dilution. We have obtained an analytic formula
for the replica-symmetric Gardner optimal capacity. For synchronous dilution we see that
different regions of activities lead to different results. For small activities $a < 0.5$ the
optimal capacity strongly increases for decreasing dilution but, after normalizing these
results by their value for no dilution, the lines for different activities collapse. For large
activities $a > 0.6$ the optimal storage capacity is only weakly dependent on the activity but
the results for the normalized optimal capacity do not collapse. Furthermore, we see that the
network is robust against synchronous dilution for non-sparse coding, i.e., for activities
ranging in the interval [0.2, 1.0]. For asynchronous dilution we find that when diluting only
the $J$-couplings, the normalized optimal capacity decreases as a function of the activity,
reaching the Gardner result for $a = 1$. When diluting the $K$-couplings, the normalized
optimal capacity increases as a function of the activity. Moreover, the network is more robust
against $K$-dilution, especially for large activities. Since the effects of dilution are of the
same order as those in the $Q = 3$ Ising model, these results also confirm the better retrieval
properties found before for the BEG model.

We have studied the stability of the RS solution against RS-breaking fluctuations by
generalizing the de Almeida-Thouless analysis. We find that as soon as there is dilution the
results are unstable.
After defining the order parameters



QafS 



-^yc"Kf, Va (46) 

1 ^ 

^Es"^;s'^/' «</^ (47) 

1 ^ 

7]^Es"^^'^f' "</5 (48) 



APPENDIX A 



In this appendix we outline the main steps in the calculation of the entropy per neuron
(18)-(19).



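introducing the conjugate order parameters $\hat{L}^\alpha$, $\hat{q}_{\alpha\beta}$,
$\hat{r}_{\alpha\beta}$, and enforcing the constraints (13) and (16) using the Lagrange
multipliers $E^\alpha$, $F^\alpha$ and $\psi^\alpha$, we write
$\langle\langle V^n \rangle\rangle$ as the following integral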



Q</3 a=\ I ^ 



where we have defined the functions 



G\ = a log 



[n 



dh'^dh"! r A de^'dO' 



2tt 



a(l — a) 



Q,/3 = l 



n n 



a,S3=\ 



n 

G2 = log n tr{e=,j=,if=} exp [ - ^ q^pcTc^J^ - ^ ?^pc^K^c^K^ 



a<l3 



a— 1 a— 1 a— 1 

n 

Q</3 a</3 Q=l 



r 



(49) 



(50) 



(51) 
(52) 



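We have already used that $\hat{L}^\alpha = 0$, $\forall \alpha$ at the saddle point. In the
thermodynamic limit $N \to \infty$ the entropy is evaluated at the saddle point for the order
parameters (48), the conjugate ones and the Lagrange multipliers $E^\alpha$, $F^\alpha$ and
$\psi^\alpha$. Using the RS ansatz for the order parameters

$$ L^\alpha = L, \qquad E^\alpha = E, \qquad F^\alpha = F, \qquad \psi^\alpha = \psi \tag{53} $$

$$ q_{\alpha\beta} = q, \qquad r_{\alpha\beta} = r, \qquad \hat{q}_{\alpha\beta} = \hat{q}, \qquad \hat{r}_{\alpha\beta} = \hat{r} \tag{54} $$

the functions $G_1$, $G_2$, $G_3$ can be simplified further and the entropy can be written as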



V 



c ^ c 
-qq — —rr 



2^^ 2 2 
with the short-hand notations 



E + F + tp 



Vix,y)\og[l]^{x,y) + a I?(/io, - 0(( log[l]«(/io, ^o))) (55) 



[l]t(x,2/) - 1 + 



2tt 



^{E-q){F-?) 

dh de 



exp 



2 — 

X q 



1^ 
y r 



V2^(l-g) V2^(l-r) 



2{E - q) 2{F - f) 2 

2(1 -g) 2(1 -r) 



(56) 
(57) 



where the integral in (57) is restricted to the region $\Omega_\xi$ given by the
characteristic function $\chi_\xi(h\sqrt{a}, \theta\sqrt{a(1 - a)}; \kappa)$ defined in (15).
The RS Gardner optimal capacity is then reached when $q$, $r$ go to 1.

At this point we have two choices to proceed. Either we solve numerically the saddle-point
equations or we do an asymptotic expansion in the limit $q, r \to 1$ in the entropy (55) (or
equivalently in the saddle-point equations for the parameters). The first approach has the
advantage that we can study $\alpha$ as a function of $(q, r)$. But, since we are only
interested in the optimal capacity, we opt for the asymptotic expansion. Since the limits
$q \to 1$ and $r \to 1$ must be taken simultaneously, we introduce a factor $\gamma$ such that
$(1 - r) = \gamma (1 - q)$. Then, a simple inspection of the function (57) appearing in the
expression of the entropy (55) suggests that in the limit $q \to 1$ this function will diverge
as $(1 - q)^{-1}$. Since this function is coupled to the capacity and we expect non-trivial
results, the other terms in the entropy also have to diverge in such a way. This implies, for
instance, that for the function $[1](x, y)$ the terms $\hat{q}/(E - \hat{q})$, $\psi$ and
$\hat{r}/(F - \hat{r})$ appearing in its argument have to go to infinity as $(1 - q)^{-1}$. The
precise coefficients in front of this divergence are given by the saddle-point equations of the
conjugated order parameters. Performing this asymptotic expansion explicitly leads to the
result (20) in Section III.